RED: Reinforced Encoder-Decoder Networks for Action Anticipation
Authors
Abstract
Action anticipation aims to detect an action before it happens. Many real-world applications in robotics and surveillance depend on this predictive capability. Current methods address the problem by first anticipating visual representations of future frames and then classifying the anticipated representations into actions. However, this anticipation is based on a single past frame’s representation, which ignores the historical trend. In addition, such methods can anticipate only at a fixed future time. We propose a Reinforced Encoder-Decoder (RED) network for action anticipation. RED takes multiple history representations as input and learns to anticipate a sequence of future representations. One salient aspect of RED is that a reinforcement module is adopted to provide sequence-level supervision; the reward function is designed to encourage the system to make correct predictions as early as possible. We test RED on the TVSeries, THUMOS-14 and TV-Human-Interaction datasets for action anticipation and achieve state-of-the-art performance on all of them.
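The abstract does not give the form of the reward function, so the following is only a minimal sketch of the general idea of a sequence-level reward that favors early correct predictions. The function name `early_correct_reward` and the linear decay are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def early_correct_reward(predicted: Sequence[int], target: int) -> float:
    """Hypothetical sequence-level reward (not the paper's definition).

    `predicted` holds one predicted class label per anticipation step,
    ordered from the earliest step to the latest. The reward depends on
    the first step at which the prediction equals `target`: the earlier
    that step, the larger the reward. A sequence that never predicts the
    target gets zero reward.
    """
    n = len(predicted)
    for step, label in enumerate(predicted):
        if label == target:
            # Assumed linear decay: 1.0 at the first step, 1/n at the last.
            return (n - step) / n
    return 0.0


if __name__ == "__main__":
    # Correct from the first step versus correct only from the second step.
    print(early_correct_reward([3, 3, 3], target=3))
    print(early_correct_reward([0, 3, 3], target=3))
```

Because the reward is computed over the whole predicted sequence rather than per step, it can only serve as a training signal through a sequence-level method such as policy-gradient reinforcement learning, which is consistent with the role the abstract assigns to the reinforcement module.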
Similar resources
On the decoding delay of encoders for input-constrained channels
Finite-state encoders that encode n-ary data into a constrained system S are considered. The anticipation, or decoding delay, of such an (S, n)-encoder is the number of symbols that a state-dependent decoder needs to look ahead in order to recover the current input symbol. Upper bounds are obtained on the smallest attainable number of states of any (S, n)-encoder with anticipation t. Those boun...
Anticipation in Human-Robot Cooperation: A Recurrent Neural Network Approach for Multiple Action Sequences Prediction
Close human-robot cooperation is a key enabler for new developments in advanced manufacturing and assistive applications. Close cooperation requires robots that can predict human actions and intent, and understand human non-verbal cues. Recent approaches based on neural networks have led to encouraging results in the human action prediction problem both in continuous and discrete spaces. Our app...
Decoding Coattention Encodings for Question Answering
An encoder-decoder architecture with recurrent neural networks in both the encoder and decoder is a standard approach to the question-answering problem (finding answers to a given question in a piece of text). The Dynamic Coattention [1] encoder is a highly effective encoder for the problem; we evaluated the effectiveness of different decoders when paired with the Dynamic Coattention encoder. We ...
FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
Recent deep networks that directly handle points in a point set, e.g., PointNet, have been state-of-the-art for supervised learning tasks on point clouds such as classification and segmentation. In this work, a novel end-to-end deep auto-encoder is proposed to address unsupervised learning challenges on point clouds. On the encoder side, a graph-based enhancement is enforced to promote local str...
Decoupling Encoder and Decoder Networks for Abstractive Document Summarization
Abstractive document summarization seeks to automatically generate a summary for a document, based on some abstract “understanding” of the original document. State-of-the-art techniques traditionally use attentive encoder–decoder architectures. However, due to the large number of parameters in these models, they require large training datasets and long training times. In this paper, we propose ...
Journal title:
- CoRR
Volume: abs/1707.04818 (no issue number listed)
Pages: -
Publication date: 2017